# How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?

## Data Preparation

Note: Data selection and tagging are important in single-concept tuning. We strongly recommend checking the data processing in [sd-scripts]. Download the datasets of customized objects from Custom Diffusion and DreamBooth.


## Concept-Incremental LoRA Tuning

### Step 1: Modify the Config

Before tuning, specify the data paths and adjust the relevant hyperparameters in the corresponding config file. The following are the basic config settings to modify.

```yaml
datasets:
  train:
    # Concept data config
    concept_list: datasets/data_cfgs/MixofShow/single-concept/objects/real/dog.json
    replace_mapping:
      <TOK>: <dog1> <dog2> # Concept new token
  val_vis:
    # Validation prompt for visualization during tuning
    prompts: datasets/validation_prompts/single-concept/characters/test_dog.txt
    replace_mapping:
      <TOK>: <dog1> <dog2> # Concept new token

models:
  enable_edlora: true  # true means ED-LoRA, false means vanilla LoRA
  new_concept_token: <dog1>+<dog2> # Concept new token, use "+" to connect
  initializer_token: <rand-0.013>+dog
  # Init token; only the latter part needs to be changed to match the semantic category of the given concept

val:
  val_during_save: true # When saving checkpoint, visualize sample results.
  compose_visualize: true # Compose all samples into a large grid figure for visualization
```
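To make the role of `replace_mapping` concrete, here is a minimal sketch of how the `<TOK>` placeholder in a prompt could be expanded into the concept tokens. It assumes plain string substitution; the helper `apply_replace_mapping` is hypothetical and is not taken from this codebase.

```python
def apply_replace_mapping(prompt: str, replace_mapping: dict) -> str:
    """Substitute every placeholder key in `prompt` with its mapped value."""
    for placeholder, replacement in replace_mapping.items():
        prompt = prompt.replace(placeholder, replacement)
    return prompt


# Mirrors the `replace_mapping` entry in the config above.
replace_mapping = {"<TOK>": "<dog1> <dog2>"}

# Example prompt template containing the placeholder.
template = "a <TOK> in front of eiffel tower"
print(apply_replace_mapping(template, replace_mapping))
```

Under this assumption, every prompt in the validation file only needs to contain `<TOK>`, and the same file can be reused for a different concept by changing the mapping.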

### Step 2: Start Tuning

We tune each concept with 2 NVIDIA RTX 4090 GPUs. As with LoRA, community users can enable gradient accumulation, xformers, and gradient checkpointing to tune on a single GPU.

```bash
sh train.sh
```
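For readers unfamiliar with gradient accumulation, the following is a small pure-Python sketch of the idea, not the training code of this repository. It uses a toy scalar model `y = w * x` with a mean-squared-error loss (both are illustrative assumptions) and shows that averaging the gradients of equal-sized micro-batches reproduces the full-batch gradient, which is why accumulation can stand in for a larger batch on a single GPU.

```python
def mse_grad(w, xs, ys):
    """Gradient of the mean squared error w.r.t. w for the model y = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average per-micro-batch gradients, as gradient accumulation does."""
    assert len(xs) % micro_batch_size == 0, "use equal-sized micro-batches"
    steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(steps):
        start = i * micro_batch_size
        end = start + micro_batch_size
        # Scale each micro-batch gradient by 1/steps before summing.
        total += mse_grad(w, xs[start:end], ys[start:end]) / steps
    return total


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
full = mse_grad(0.5, xs, ys)
accum = accumulated_grad(0.5, xs, ys, micro_batch_size=2)
print(abs(full - accum) < 1e-9)
```

In a real training loop the same effect is obtained by dividing each micro-batch loss by the number of accumulation steps and calling the optimizer step only after the last micro-batch.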

### Step 3: Sample


**Direct image sampling:**

```python
import torch
from diffusers import DPMSolverMultistepScheduler
from mixofshow.pipelines.pipeline_edlora import EDLoRAPipeline, StableDiffusionPipeline
from mixofshow.utils.convert_edlora_to_diffusers import convert_edlora

pretrained_model_path = ''  # set this to the path of the pretrained base model
lora_model_path = 'experiments/task_1/models/checkpoint-latest/edlora.pth'
enable_edlora = True  # True for edlora, False for lora

pipeclass = EDLoRAPipeline if enable_edlora else StableDiffusionPipeline
scheduler = DPMSolverMultistepScheduler.from_pretrained(pretrained_model_path, subfolder='scheduler')
pipe = pipeclass.from_pretrained(pretrained_model_path, scheduler=scheduler, torch_dtype=torch.float16).to('cuda')
pipe, new_concept_cfg = convert_edlora(pipe, torch.load(lora_model_path), enable_edlora=enable_edlora, alpha=0.7)
extra_args = {'new_concept_cfg': new_concept_cfg} if enable_edlora else {}

TOK = '<dog1> <dog2>'  # TOK is the concept token used when training LoRA/ED-LoRA
prompt = f'a {TOK} in front of eiffel tower, 4K, high quality, high resolution'
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'
image = pipe(prompt, negative_prompt=negative_prompt, height=512, width=512, num_inference_steps=50, guidance_scale=7.5, **extra_args).images[0]
image.save('res.jpg')
```

**Regionally Context-Controllable Synthesis:**

```bash
bash regionally_sample.sh
```

## 📜 License and Acknowledgement

This project is released under the [Apache 2.0 license](LICENSE).<br>
This codebase builds on [diffusers]. Thanks for open-sourcing! We also acknowledge the following amazing open-source projects:

- Mix-of-Show
- Custom Diffusion
- T2I-Adapter

